/************************************************************************************
 *______________________________________Depth_______________________________________*
 ************************************************************************************/

/*
PURPOSE:

    The goal of this script is to calculate average depth among the Top 8 exchanges for symbol-dates
    in 2015 from the Daily TAQ Quotes file. This script also carries the option of calculating
    depth at finer time intervals than the trading day (e.g. 5 min, 1 hour, 1 min, etc.).

    A row in the output of this script would be, for example, the average depth of NYSE
    for SPY on March 8, 2015.

DEFINITIONS:

    We define depth at the millisecond as the average of the exchange’s quantity
    at the best ask and the exchange’s quantity at the best bid as of the end of the millisecond
    [quantity can be zero if the exchange is not at the best bid or best ask]. The best bid
    and the best ask are not necessarily the NBBO, but rather the BBO among the Top 8 exchanges.

    We define total, time-weighted depth to be the depth observed in a given millisecond
    multiplied by the total number of milliseconds for which this depth was on the book.
    Since we don't observe activity in every millisecond, this is how we compute the sum
    of the depth at every millisecond.

    We define average depth for a symbol-date-exchange to be the total, time-weighted depth
    divided by the number of milliseconds with a valid BBO.

PRIMARY INPUT:

    The Daily TAQ Quotes file is a dataset where a row is a single quote. Each row contains
    the date the quote was executed, the timestamp (millisecond precision before 8/3/2015,
    microsecond precision afterwards), the symbol root (name) and symbol suffix (equity class),
    the exchange where the quote was executed, the bid, bid size, ask, ask size,
    and the quote condition code that describes the context of the quote (e.g. an opening quote,
    regular quote, slow on the ask side, etc.).

APPROACH:

    The approach can be summarized as follows:

    STEP 1: We load the quotes for a single day and drop the quotes that were not from the
    Top 8 exchanges, made outside of regular trading hours (adjusted for halfdays),
    and during holidays. Out of this set of quotes we define quotes satisfying the following
    conditions as regular quotes. Only regular quotes are eligible for the BBO.

    1. The quote had a valid record and was not been cancelled or corrected.
    2. The quote is labelled as "regular, two-sided" by the quote sale condition code.

    STEP 2: We group the quotes by date (interval), symbol, ms and exchange.
    At every ms, we compute the BBO as the maximum bid and minimum ask among the
    current quotes as of the end ms.

    STEP 3: We compute the depth for an exchange at the ms by first evaluating whether
    it is at the best bid or ask. If it is at both, depth is the average of its bid and ask size.
    If the exchange is only at the BBO  on one side, its depth is the size of its current
    quote at that side.

    STEP 4: We compute total, time-weighted depth by taking the sum of depth at every ms
    multiplied by the depth that depth is active. We compute average depth as the total depth
    divided by the number of valid ms from the time of the first quote to the end
    of the trading day.

    The code is structured as a large loop over each quote in the day, so although the algorithm
    can be conceptually broken down into three steps (calculate BBO, depth at millisecond, average depth),
    all of these calculations are updated as we process quotes.

    The output is a symbol-date-exchange level dataset of the depth of each exchange for each
    symbol for the given day. This is repeated for every day for every month in 2015.

SUPPLEMENTAL INPUT FILES:

    "Server_Directories.xlsx"
    "Holidays.xlsx"

PREREQ:

    Go to the section named "GLOBAL VARIABLES" and find the statement that declares
    "Input_Exchanges_for_BBO". Verify that these are the exchanges you want to include
    when calculating volume. Also verify that "exchange_index_string" contains the
    exchanges you included in "Input_Exchanges_for_BBO".

    Find the statement that declares "Interval_Length_Input". Set this to the time interval
    length of your choice using the format ("hh:mm:ss.xxx"t).
    The default is "06:30:00.000", which is the length of a full trading day.
    For halfdays, the interval length is set to "4:00:00.000", which is the length
    of the trading day during halfdays.

INSTRUCTION:

    Find the wrapper "Depth.sh". Make sure that in this file, the argument after
    "-sysparm" is "2015". Submit the wrapper on the WRDS server.
*/

/*****************************************************************************************
 *_______________________________________INITIALIZE______________________________________*
 *****************************************************************************************/

/*
 * WRDS LIBRARY:
 *
 * This sets up the wharton library, which allows us to access the CRSP and TAQ datasets.
 */
%include '/wrds/lib/utility/wrdslib.sas';
options mprint;

/* This allows the "in" operator to be used inside of macros. */
options minoperator;

/*
 * IMPORT INPUT PARAMETERS:
 *
 * Make sure to enter the path to your input directory below.
 */
proc import datafile="/home/uchicago/eche/sasuser.v94/WRDS/Input/Server_Directories" out= server_directories dbms = xlsx replace;
run;
proc import datafile="/home/uchicago/eche/sasuser.v94/WRDS/Input/Holidays" out= holidays dbms = xlsx replace;
run;

data holiday_import;
  set holidays;

  if holiday = "New_Years" then do; call symput('New_Years', trim(date)); end;
  if holiday = "MLK_Day" then do; call symput('MLK_Day', trim(date)); end;
  if holiday = "Presidents_Day" then do; call symput('Presidents_Day', trim(date)); end;
  if holiday = "Good_Friday" then do; call symput('Good_Friday', trim(date)); end;
  if holiday = "Memorial_Day" then do; call symput('Memorial_Day', trim(date)); end;
  if holiday = "Independence_Day" then do; call symput('Independence_Day', trim(date)); end;
  if holiday = "Labor_Day" then do; call symput('Labor_Day', trim(date)); end;
  if holiday = "Thanksgiving" then do; call symput('Thanksgiving', trim(date)); end;
  if holiday = "Christmas" then do; call symput('Christmas', trim(date)); end;
  if holiday = "Hurricane_Sandy" then do; call symput('Hurricane_Sandy', trim(date)); end;
  if holiday = "Halfdays" then do; call symput('Halfdays', trim(date)); end;
run;

data directory_import;
  set server_directories;

  if directory = "user_directory" then do; call symput('userdir', trim(path)); end;
  if directory = "volume_directory" then do; call symput('voldir', trim(path)); end;
  if directory = "temp_directory" then do; call symput('tempdir', trim(path)); end;
run;

/**
 * Directories:
 *
 * Sets up the directories according to the paths in the "Server_Directories.xlsx" file.
 */
libname userdir &userdir;
libname voldir &voldir;
libname tempdir &tempdir;

/* Set user_path for import/export .
 * We need to dequote the path passed in from "Server_Directories.xlsx".
 */
%let user_path  = %qsysfunc(dequote(&userdir));

/**********************************************************************************************
 *______________________________________GLOBAL VARIABLES______________________________________*
 **********************************************************************************************/

/* Contains the year and the interval length. Uncomment to run from the SAS Studio website. */
/* %let sysparm = "2015"; */

/* Obtain the year from wrapper. */
%let YEAR_INPUT = &SYSPARM;

/* Set interval length. */
%let interval_length_input = "06:30:00.000"t;

/* Trade beginning and end times. */
%let trade_begin_time = "09:30:00.000"t;
%let trade_end_time_reg = "16:00:00.000"t;
%let trade_end_time_halfday = "13:00:00.000"t;

/* Length of the trading day. */
%let regular_trading_day_length = "06:30:00.000"t;
%let halfday_trading_length = "03:30:00.000"t;

/*
 * Set constant to convert seconds to milliseconds.
 * Set constant to convert microseconds to milliseconds.
 */
%let seconds_to_milliseconds = 1000;
%let micro_to_milliseconds = 1000;

/* Set constant for rounding to milliseconds. */
%let millisecond_precision = 0.001;

/*
 * Exchanges:
 *
 * "Input_Exchanges_for_BBO" is the list of exchanges which are included
 * when calculating the Best Bid or Offer.
 * If you want to restrict to the top 5 or enlarge the exchange pool,
 * then change the exchange codes in "Input_Exchanges_for_BBO".
 *
 * The following lists which code belongs to each exchange:
 *
 * Top 8:
 *
 * 'N': NYSE
 * 'P': NYSE Arca
 * 'T': Nasdaq (Tape A and B)
 * 'Z': Bats BZX
 * 'K': Bats EDGX
 * 'B': Nasdaq (BX)
 * 'J': Bats EDGA
 * 'Y': Bats BYX
 * 'Q': Nasdaq (Tape C, all the volume under this exchange
 * is moved to 'T' during this calculation)
 *
 * Other Exchanges with non-zero trading in 2015:
 * 'A': NYSE Market
 * 'X': Nasdaq PSX
 * 'M': CHX
 */
%global Input_Exchanges_for_BBO;
%let Input_Exchanges_for_BBO = ('N','P','T','Z','K');

/*
 * "exchange_index_string" is used to index the arrays when storing the
 * bids and asks at each exchange.
 *
 * This means that in the bid array, for example, the first entry will be NYSE's
 * bid since the first code in the string ('N') is the code for NYSE.
 * The order of this string doesn't have to align with the order of
 * "Input_Exchanges_for_BBO", since the order of "Input_Exchanges_for_BBO" does not matter.
 *
 * However, any exchange specified here should also be present in "Input_Exchanges_for_BBO",
 * because otherwise it will not appear at all in the output.
 *
 * NOTE: 'Q' is in "Input_Exchanges_for_BBO" but not in "exchange_index_string".
 * This is because we will combine 'T' and 'Q' (both NASDAQ) into 'T'.
 */
%global exchange_index_string;
%let exchange_index_string = NPTZK;

/* Number of exchanges in the exchange code string. */
%global nEx;
%let nEx = %length(&exchange_index_string);

/***********************************************************************************
 *______________________________________MACRO______________________________________*
 ***********************************************************************************/

/*
 * _MACRO_ : Daily_Depth(YYYYMMDD)
 *
 * PARAM: YYYYMMDD: The day for which this macro calculates depth
 *
 * This Macro first loads the Daily TAQ Quotes file and restricts the data
 * to quotes that occur during regular trading hours, not on a holiday, and
 * from a specified set of exchanges. We then label quotes as regular quotes
 * if they have the "regular" quote condition code and the "Not a cancel quote"
 * quote correction code. We only use regular quotes when determining the Best Bid and Offer.
 *
 * After we've filtered the quotes dataset, we group them by date, (interval if relevant),
 * symbol, ms, and exchange. Then we compute depth using the following steps.
 *
 * [a] Time-weighted depth at previous ms:
 *     First, at the first quote of the ms, we measure the duration between
 *     this ms and the previous one (if this is the first ms we skip this step).
 *     If a valid BBO exists (two sided, not locked or crossed) then we calculate
 *     the depth in the previous ms and multiply the depth by this duration.
 *     We add this time-weighted depth to running totals for each exchange.
 *     This is how we compute total depth in STEP 3 in the header.
 *
 *     We don't start counting anything until the first quote of the day.
 *     For the first ms of quoting, we skip step [a] but continue to [b]-[d]
 *     so we can compute the time-weighted depth at the first ms once we are
 *     at the next ms.
 *
 * [b] Calculate BBO:
 *     Second, for every quote in the ms, we update the current bid/ask
 *     for the exchange in the quote. For irregular quotes, assign the bid/ask to null.
 *     Then we update the current BBO as the maximum bid and minimum ask of all prevailing quotes.
 *     This is STEP 1 of the header.
 *
 * [c] Exchanges at BBO:
 *     Third, at the end of the ms, we consider the current BBO as of the end of the ms
 *     as the BBO for the ms. We evaluate whether each exchange's current quote as of the ms
 *     is at the BBO. This is STEP 2 of the header.
 *
 * [d] Calculate average depth:
 *     At the end of the trading day, we calculate average depth for each exchange by
 *     dividing the running totals of depth by the total number of valid ms
 *     the number of ms elapsed between the first quote of the day and the end of
 *     the trading day with a valid BBO (not "LockedOrCrossed", not "noTwoSideMarket")).
 *     This is how we calculate average depth in STEP 3 of the header.
 *
 * OUTPUT: Day_Depth_&YYYYMMDD, symbol-date(interval)-exchange level depth dataset
 */

/*
 * Note: "mindelimiter" is an option that allows us to use the "in" operator within macros.
 * This delimiter specifies that when using the "in" operator (used to evaluate if an object is in a list),
 * the list this operator is used on will be separated by commas.
 */
%Macro Daily_Depth(YYYYMMDD)/mindelimiter=',';
    /*
     * Halfday correction:
     *
     * Using the list of halfdays in the "Holidays.xlsx" input file,
     * we identify whether the current day is a halfday or not.
     * If so, then we change the end of the trading day to 1:00 pm.
     */
    %let DDMMYYYY = "%sysfunc(inputn(&YYYYMMDD, yymmdd10.), date9.)"d;
    %if &DDMMYYYY in &halfdays %then %do;
        %let trade_end_time = &trade_end_time_halfday;
    %end;
    %else %do;
        %let trade_end_time = &trade_end_time_reg;
    %end;
    %put &trade_end_time;

    /*
     * If it is a half day and if the interval is equal to a full trading day,
     * we adjust the interval length to be the length of a half day.
     */
    %if &DDMMYYYY in &halfdays %then %do;
        %if &interval_length_input = &regular_trading_day_length %then %do;
            %let interval_length = &halfday_trading_length;
        %end;
        %else %do;
            %let interval_length = &interval_length_input;
        %end;
    %end;
    %else %do;
        %let interval_length = &interval_length_input;
    %end;
    %put &interval_length;

    /*
     * f_Exchanges
     *
     * We only include quotes made by exchanges in the list in "Input_Exchanges_for_BBO".
     */
    %let f_Exchanges = ex in &Input_Exchanges_for_BBO;

    /* f_Holiday
     *
     * We remove holidays from our sample.
     */
    %let f_New_Years_Day = date in &New_Years;
    %let f_MLK_Day=date in &MLK_Day;
    %let f_Presidents_Day=date in &Presidents_Day;
    %let f_Good_Friday=date in &Good_Friday;
    %let f_Memorial_Day=date in &Memorial_Day;
    %let f_Indpdnce_Day=date in &Independence_Day;
    %let f_Labor_Day=date in &Labor_Day;
    %let f_Thanksgiving=date in &Thanksgiving;
    %let f_Christmas_Day=date in &Christmas;

    %let f_Holiday = (&f_New_Years_Day or
                    &f_MLK_Day or
                    &f_Presidents_Day or
                    &f_Good_Friday or
                    &f_Memorial_Day or
                    &f_Indpdnce_Day or
                    &f_Labor_Day or
                    &f_Thanksgiving or
                    &f_Christmas_Day);

    /*
     * f_Regular_Hours
     *
     * We only include quotes that occur during regular trading hours.
     */
    %let f_Regular_Hours = time_m >= &trade_begin_time and time_m < &trade_end_time;

    /*
     * INPUT: taqmsec.cqm_&YYYYMMDD, Daily TAQ Quotes file
     *
     * Filters out the Daily TAQ Quotes File according to the conditions set above.
     * Identifies regular quotes with "f_Regular_Quote."
     * Assigns an interval to each quote.
     *
     * OUTPUT: quotes, Daily TAQ Quotes File with quote flags
     */
    data quotes;
        set taqmsec.cqm_&YYYYMMDD.;
        where ((&f_Regular_Hours) AND (&f_Exchanges) and (~&f_Holiday));

        /* This makes sure we are measuring time in ms. */
        time_m = floor(time_m * &micro_to_milliseconds)/&micro_to_milliseconds;

        /* We move all the NASDAQ Quotes into one exchange code. */
        if ex = 'Q' then ex = 'T';

        /*
         * Calculate which interval a trade falls in from the ms timestamp.
         * For a 6.5 hr interval (length of the trading day), all trades are in interval 0.
         */
        interval =  floor((time_m - &trade_begin_time)/&interval_length);
        interval_end_time = (&trade_begin_time + &interval_length * (interval + 1));

        /* f_Regular_Condition:
         * Quotes with the quote condition "R": Regular, two-sided quotes.
         */
        f_Regular_Condition = qu_cond ='R';

        /*
         * f_Valid_Record
         *
         * qu_cancel:
         * "" = Not a cancel quote (CTA Only)
         * "A" = Not a cancel quote (UTP Only)
         */
        f_Valid_Record = qu_cancel in ("", "A");

        /* We define a regular quote to be a regular, two-sided quote that has a valid record. */
        f_Regular_Quote = f_Regular_Condition and f_Valid_Record;

        format date yymmdd10.;
        format time_m time12.3;

        keep date time_m ex sym_root sym_suffix bid ask bidsiz asksiz qu_cond interval interval_end_time interval_len;
    run;

    /*
     * We sort the quotes by symbol, date, interval, ms, and exchange.
     *
     * Note: Hasbrouck (2010) points out that the BBO calculation can be wrong
     * if the sequence of quotes within an exchange is changed.
     *
     * The EQUALS option for proc sort preserves the order of the original dataset
     * within the by groups. EQUALS is default for proc sort but we
     * specify it here for clarity.
     */
    proc sort data = quotes out = quotes equals;
      by sym_root sym_suffix date interval time_m ex;
    run;

    /*
     * INPUT: quotes, Daily TAQ Quotes File with quote flags
     *
     * We group the quotes dataset by symbol, date, (interval if used), ms, and exchange.
     * We then compute the BBO and depth at every ms, along with the average depth for the day,
     * with the steps described in the macro comment.
     *
     * OUTPUT: Depth_&YYYYMMDD, symbol-date(interval)-exchange depth data
     */
    data Depth_&YYYYMMDD.;
        set quotes;
        by sym_root sym_suffix date interval time_m ex;

        /**
         * ______________________________________INITIALIZATION______________________________________
         */

        /*
         * exBid: array containing the current bid at each exchange
         * exAsk: array containing the current ask at each exchange
         * exBidSize: array containing the current size of the bid at each exchange
         * exAskSize: array containing the current size of the ask at each exchange
         * f_exAtBestBid: array containing flags for each exchange that indicate
         *  whether the exchange is at the best bid by the end of the ms
         * f_exAtBestAsk: array containing flags for each exchange that indicate
         *  whether the exchange is at the best ask by the end of the ms
         * exCombinedDepth: array containing the running total of combined depth
         *  weighted by the number of ms the depth is active for each exchange
         * exBidDepth: array containing the running total of bid depth
         *  weighted by the number of ms the depth is active for each exchange
         * exAskDepth: array containing the running total of ask depth
         *  weighted by the number of ms the depth is active for each exchange
         */

        array exBid(&nEx);
        array exAsk(&nEx);
        array exBidSize(&nEx);
        array exAskSize(&nEx);
        array f_exAtBestBid(&nEx);
        array f_exAtBestAsk(&nEx);
        array exCombinedDepth(&nEx);
        array exBidDepth(&nEx);
        array exAskDepth(&nEx);

        /* retain saves the quote variables across iterations */
        retain exBid1-exBid&nEx exAsk1-exAsk&nEx exBidSize1-exBidSize&nEx exAskSize1-exAskSize&nEx
        f_exAtBestAsk1-f_exAtBestAsk&nEx f_exAtBestBid1-f_exAtBestBid&nEx
        exCombinedDepth1-exCombinedDepth&nEx exBidDepth1-exBidDepth&nEx exAskDepth1-exAskDepth&nEx
        BBid BAsk prevBB prevBO prevLockedOrCrossed prevNoTwoSideMarket prevTime
        duration_LockedOrCrossed duration_noTwoSide time_firstQuote
        LockedOrCrossed noTwoSideMarket .;

        /*
         * ______________________________________FIRST QUOTE OF THE INTERVAL______________________________________
         */

        /* Reset variables at the beginning of every interval. */

        if first.interval then do;
            do i=1 to &nEx;
                exBid(i) = .;
                exAsk(i) = .;
                exBidSize(i) = 0;
                exAskSize(i) = 0;
                exBidDepth(i) = 0;
                exAskDepth(i) = 0;
                exCombinedDepth(i) = 0;
                f_exAtBestAsk(i) = .;
                f_exAtBestBid(i) = .;
            end;
            BBid = .;
            BAsk = .;
            prevBO = .;
            prevBB = .;
            prevLockedOrCrossed = .;
            prevNoTwoSideMarket = .;
            prevTime = time_m;
            time_firstQuote = time_m;
            duration_LockedOrCrossed = 0;
            duration_noTwoSide = 0;
            drop i;
        end;

        /*
         * ______________________________________WITHIN THE INTERVAL______________________________________
         */

        /* If the quote is not a regular quote, then we set the bid and the ask to null. */
        if f_Regular_Quote = 0 then do;
            bid = .;
            bidsiz = 0;
            ask = .;
            asksiz = 0;
        end;

        /*
         * Set negative, null, or zero bids/asks to null.
         * Set the sizes of such bids/asks to 0 too.
         *
         * Bids, bidsizes, asks, and asksizes have not been observed to be negative in the quotes dataset
         * for a random sample of days (20150812, 20150206, and 20151221).
         * This is to ensure that the if such quotes arrive, they are ineligible for the BBO
         */
        if bid <= 0 or bidsiz <= 0 or bid = . then do;
            bid = .;
            bidsiz = 0;
        end;
        if ask <= 0 or asksiz <= 0 or ask = . then do;
            ask = .;
            asksiz = 0;
        end;

        /*
         * ______________________________________FIRST QUOTE OF THE MILLISECOND______________________________________
         */

        /*
         * [a] Time-weighted depth at previous ms:
         *
         * This code calculates the depth at the previous ms (only if this is
         * not the first ms of the interval) multiplied by the duration that depth was active.
         * Duration is calculated as the number of ms between this ms
         * and the previous ms.
         *
         * If an exchange was at the best bid/ask by the end of the last ms, we calculate its bid/ask depth
         * and multiply it by the duration. For the combined depth, we take the average of its bid and ask depth
         * (if the exchange is at the best bid and ask) and multiply it by the duration.
         * We add this to the running totals of ask depth, bid depth, and combined depth.
         */
        if first.time_m and not first.interval then do;

            /*
             * Note that duration is measured in ms.
             * We round by millisecond_precision = 0.001 to ensure we measure milliseconds as a natural number.
             */
            duration = round(time_m - prevTime, &millisecond_precision) * &seconds_to_milliseconds;

          /*
           * If the market was locked or missing a best bid or ask at the previous ms,
           * we add this duration to "duration_LockedOrCrossed" or "duration_noTwoSide".
           */
          if prevLockedOrCrossed = 1 OR prevNoTwoSideMarket = 1 then do;
            duration_LockedOrCrossed = duration_LockedOrCrossed + prevLockedOrCrossed * duration;
            duration_noTwoSide = duration_noTwoSide + prevNoTwoSideMarket * duration;
          end;
          else do;
            /* Calculate bid depth, ask depth, and combined depth and add to totals. */
            do i = 1 to &nEx;
              exBidDepth(i) = exBidDepth(i) + exBidSize(i) * f_exAtBestBid(i) * duration;
              exAskDepth(i) = exAskDepth(i) + exAskSize(i) * f_exAtBestAsk(i) * duration;
              exCombinedDepth(i) = exCombinedDepth(i) + ((exBidSize(i) * f_exAtBestBid(i) + exAskSize(i) * f_exAtBestAsk(i)) / 2) * duration;
            end;
          end;
        end;

        /*
         * ______________________________________FOR EVERY QUOTE DURING THE MILLISECOND______________________________________
         */

        /*
         * [b] Calculate BBO:
         *
         * Within the ms, we iterate through all the quotes
         * and store the most recent bid and ask in the "exBid" and "exAsk" arrays.
         * With every quote we update the Best Bid and the Best Ask
         * by taking the max of the bid array and the min of the ask array respectively.
         *
         * kEx specifies which index in each array corresponds to which exchange,
         * as specified in "exchange_index_string".
         */
        kEx = index("&exchange_index_string",ex);

        if kEx > 0 then do;
            exBid(kEx) = bid;
            exAsk(kEx) = ask;
            exBidSize(kEx) = bidsiz;
            exAskSize(kEx) = asksiz;
            BBid = max(of exBid1-exBid&nEx);
            BAsk = min(of exAsk1-exAsk&nEx);
        end;

        /*
         * noTwoSideMarket: 1 if there is no best bid or best ask.
         * LockedOrCrossed: 1 if the best bid is above the best ask.
         */
        noTwoSideMarket = (BBid = . OR BAsk = .);
        LockedOrCrossed = (BBid >= BAsk) * (noTwoSideMarket = 0);

        /*
         * ______________________________________LAST QUOTE OF THE MILLISECOND______________________________________
         */

        /*
         * [c] Exchanges at BBO:
         *
         * An exchange is considered to be at the BBO
         * if the last quote by the exchange during the ms is at the BBO.
         *
         * We save the best bid, best ask, and whether the market was locked or not two sided,
         * so that in the next ms we can calculate
         * the depth of each exchange and how long this depth was active.
         */
        if last.time_m then do;
          do i = 1 to &nEx;
            f_exAtBestBid(i) = (exBid(i) = BBid);
            f_exAtBestAsk(i) = (exAsk(i) = BAsk);
          end;
          prevBB = BBid;
          prevBO = BAsk;
          prevLockedOrCrossed = LockedOrCrossed;
          prevNoTwoSideMarket = noTwoSideMarket;
          prevTime = time_m;
        end;

        /*
         * ______________________________________LAST QUOTE OF THE INTERVAL______________________________________
         */

        if last.interval then do;

          /*
           * Upon the last quote of the interval, we record the time elapsed
           * between the time of the last quote and the end of the interval.
           */
          duration_last = round(interval_end_time - time_m, &millisecond_precision) * &seconds_to_milliseconds;

          /*
           * We record the total time elapsed in the whole interval,
           * from the time of the first quote.
           */
          sum_duration = round(interval_end_time - time_firstQuote, &millisecond_precision) * &seconds_to_milliseconds;

          /*
           * If the market is locked or crossed at the last ms of quoting,
           * we add the last duration of the interval to duration_LockedOrCrossed.
           */
          duration_LockedOrCrossed = duration_LockedOrCrossed + LockedOrCrossed * duration_last;

          /*
           * If there is no best bid or ask at the last ms of quoting,
           * we add the last duration of the interval to duration_noTwoSide.
           */
          duration_noTwoSide = duration_noTwoSide + noTwoSideMarket * duration_last;

          /*
           * We calculate the total milliseconds in which the market
           * was neither locked, crossed, or missing a bid or ask.
           *
           * If this duration is 0, then this is probably not a valid date, time interval or symbol,
           * so we set this duration to null.
           */
          sum_duration_noLocked_twoSide = sum_duration - duration_LockedOrCrossed - duration_noTwoSide;
          if sum_duration_noLocked_twoSide = 0 then sum_duration_noLocked_twoSide = .;

        /* Calculate bid depth, ask depth, and depth and add to totals. */
        do i = 1 to &nEx;
            if LockedOrCrossed = 0 AND noTwoSideMarket = 0 then do;
                exBidDepth(i) = exBidDepth(i) + exBidSize(i) * f_exAtBestBid(i) * duration_last;
                exAskDepth(i) = exAskDepth(i) + exAskSize(i) * f_exAtBestAsk(i) * duration_last;
                exCombinedDepth(i) = exCombinedDepth(i) + ((exBidSize(i) * f_exAtBestBid(i) + exAskSize(i) * f_exAtBestAsk(i)) / 2) * duration_last;
            end;

        /*
         * [d] Calculate average depth:
         *
         * We divide total, time-weighted depth (stored in exBidDepth, exAskDepth, and exCombinedDepth)
         * by sum_duration_noLocked_twoSide, which counts all the milliseconds in which there is a valid BBO
         * to get average depth for each exchange for every symbol.
         *
         * For each exchange we output:
         * ex: exchange code
         * avgDepth: average depth, which is calculated as the total depth of the exchange,
         *    divided by the total non-locked, two-sided duration
         * bidDepth: average bid depth, which is calculated as the total bid depth of the exchange,
         *    divided by the total non-locked, two-sided duration
         * askDepth: average ask depth, which is calculated as the total ask depth of the exchange,
         *    divided by the total non-locked, two-sided duration
         * valid_duration: total non-locked, two-sided duration of the interval (in ms)
         * locked_market_duration: total duration during which the best bid is above the best ask (in ms)
         * no_two_side_duartion: total duration during which only one side (bid or ask) is valid (in ms)
         */
              ex = substr("&exchange_index_string",i,1);
              avgDepth = exCombinedDepth(i) / sum_duration_noLocked_twoSide;
              bidDepth = exBidDepth(i) / sum_duration_noLocked_twoSide;
              askDepth = exAskDepth(i) / sum_duration_noLocked_twoSide;
              valid_duration = sum_duration_noLocked_twoSide;
              LockedOrCrossed_duration = duration_LockedOrCrossed;
              no_two_side_duration = duration_noTwoSide;
              output;
            end;
        end;
        else delete;

        keep date interval ex sym_root sym_suffix bidDepth askDepth avgDepth valid_duration LockedOrCrossed_duration no_two_side_duration;

    run;

    /*** Save the daily depth file in the volume directory. ***/
    data voldir.Day_Depth_top5_&YYYYMMDD.;
        set Depth_&YYYYMMDD.;
    run;

    /* Clear work library. */
    proc datasets library= work;
        delete Depth_&YYYYMMDD.;
        delete quotes;
    quit;

%Mend;

/*
 * _MACRO_ : Monthly_Depth(YYYYMM)
 *
 * PARAM: YYYYMM: The month for which this macro aggregates volume.
 *
 * REQUIRES: Daily_Depth
 *
 * This Macro iterates through all the non-zero quoting days in a given month
 * (denoted by YYYYMM) specified from the "TAQMSEC" library.
 * For each day it runs the Daily_Depth macro.
 * Afterwards, we load all the daily depth files into a monthly dataset, and delete the daily files.
 *
 * OUTPUT: Month_Depth_&YYYYMM, symbol-date(interval)-exchange level depth dataset
 */
%Macro Monthly_Depth(YYYYMM);

    /*
     * INPUT: sashelp.vtable, this is a table containing a row for all available datasets.
     * There's a column for "libname", the library the dataset belongs to, and "memname",
     * which is the name of the dataset.
     *
     * Extracts the Daily TAQ Quotes filenames for each day in the month.
     *
     * OUTPUT: Daily_Quotes_Filenames, list of filenames
     */
    proc sql;
        create table Daily_Quotes_Filenames
        as select
        libname,
        memname
        from sashelp.vtable where
        libname = "TAQMSEC" and memname like "CQM_&YYYYMM.%";
    quit;

    /*
     * Counts the number of filenames in Daily_Quotes_Filenames as "nobs".
     * Save a list of the filenames as "file_list", separated by "|".
     * These are used to iterate over the valid quoting days within the month.
     */
    proc sql print;
        select count(*) into: nobs from Daily_Quotes_Filenames;
        select memname into: file_list separated by '|' from Daily_Quotes_Filenames;
    quit;

    /* Loop over daily filenames.. */
    %do i = 1 %to &nobs.;

        /* Get the name of the days and the daily file names in the month. */
        %let cqm_file  = %scan(%quote(&file_list.), &i., '|');

        /* The date is extracted from the filename by taking the content after "CTM_". */
        %let YYYYMMDD  = %substr(&cqm_file., 5, 8);

        /* Apply Daily_Depth Macro. */
        %Daily_Depth(YYYYMMDD=&YYYYMMDD.);
    %end;

    /*** Save monthly file to the volume directory. ***/
    data voldir.Month_Depth_top5_&YYYYMM.;
        set voldir.Day_Depth_top5_&YYYYMM.:;
    run;

    /* Delete the daily files in the volume directory. */
    proc datasets library = voldir;
        delete Day_Depth_top5_&YYYYMM.:;
    run;

%Mend;

/*
 * _MACRO_ : Annual_Depth(YYYY)
 *
 * PARAM: YYYY: The month for which this macro aggregates volume.
 *
 * REQUIRES: Monthly_Depth
 *
 * This macro runs "Monthly_Depth" for every month in the year.
 * For 2007, we begin from October since that was when Regulation NMS was implemented.
 *
 * OUTPUT: Depth_Interval_&YYYY, symbol-date(interval)-exchange level depth dataset
 */
%Macro Annual_Depth(YYYY);
    %let month_list = 01|02|03|04|05|06|07|08|09|10|11|12;

    /*
     * Iterate across all months in the year for years after 2007.
     * Run Monthly_Depth for every month in the year.
     */
    %if (&YYYY. > 2007) %then %do;
        %do m = 1 %to 12;
            %let month = %scan(%quote(&month_list.), &m., '|');
            %Monthly_Depth(YYYYMM=&YYYY.&month.);
        %end;
    %end;

    /* For 2007, start data collection in october when REG NMS starts. */
    %if (&YYYY.=2007) %then %do;
        %do m = 10 %to 12;
            %let month = %scan(%quote(&month_list.), &m., '|');
            %Monthly_Depth(YYYYMM=&YYYY.&month.);
        %end;
    %end;

    /*** Save annual file to the volume directory. ***/
    data voldir.Depth_top5_&YYYY.;
        set voldir.Month_Depth_top5_&YYYY.:;
    run;

    /* Save in temporary directory too */
    data tempdir.Depth_top5_&YYYY.;
        set voldir.Depth_top5_&YYYY.;
    run;

    /* Delete the monthly files in the volume directory. */
    proc datasets library = voldir;
        delete Month_Depth_top5_&YYYY.:;
        run;
    quit;
%Mend;

/*********************************************************************************
 *______________________________________RUN______________________________________*
 *********************************************************************************/

%Annual_Depth(&year_input.)
